Modify ori file reader to deal with absence of EN (or not) at the end of the file by GallegoSav · Pull Request #516 · cositools/cosipy

GallegoSav · 2026-03-13T20:18:49Z

This PR addresses #503.

If the ori file ends with an EN line, dropna will remove it. If not, nothing changes.

Updated data processing to drop NaN values from the dataframe.
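
For readers who want to see the mechanism, here is a minimal sketch rather than the actual cosipy code. It assumes a made-up .ori snippet with nine numeric columns after the OG tag; the sample values and the helper name `read_ori_rows` are hypothetical. It reuses the `pd.read_csv` options of the reader: a trailing EN line has no numeric fields, so it is parsed as an all-NaN row that `dropna()` discards, while a file without EN keeps every row.

```python
import io

import pandas as pd

# Hypothetical sample content; real .ori files hold many more rows.
HEADER = "Type OrientationsGalactic\n"
ROWS = (
    "OG 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0\n"
    "OG 1.0 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5\n"
)


def read_ori_rows(text):
    """Parse .ori text and drop rows that contain NaN (e.g. the EN line)."""
    df = pd.read_csv(io.StringIO(text), sep=r"\s+", skiprows=1,
                     usecols=tuple(range(1, 10)), header=None, comment="#")
    # The EN line has a single field, so columns 1-9 are padded with NaN.
    return df.dropna()


with_en = read_ori_rows(HEADER + ROWS + "EN\n")
without_en = read_ori_rows(HEADER + ROWS)
print(len(with_en), len(without_en))
```

Note that `dropna()` removes any row containing a NaN, not only the EN row, so a malformed data line would also disappear silently.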

codecov · 2026-03-13T20:25:45Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.18%. Comparing base (4911568) to head (61429a9).
⚠️ Report is 29 commits behind head on develop.

Files with missing lines	Coverage Δ
cosipy/spacecraftfile/spacecraft_file.py	`85.08% <100.00%> (ø)`


jdbuhler · 2026-03-13T20:40:18Z

If the issue is that one particular ori file is malformed because the expected EN got dropped, why are we not fixing the file in question, instead of complicating the parser in a way that invites future bugs?

jdbuhler · 2026-03-14T05:22:05Z

Further comments:

Of the three .ori files used in the tutorials:

- 20280301_3_month... has a trailing EN line
- DC3_final_...15sbins... has no trailing EN (but does have a blank line at the end, which the CSV parser ignores) and so was having the last line dropped
- DC3_final_...1sbins... has neither a trailing EN nor a blank line and so was having the last line dropped

The test case .ori file I recently extracted from DC3_final_...1sbins... follows its format and so also lacks any trailing marker. The other three test case .ori files all have an EN marker.
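
The three file endings described above can be compared with a small sketch. It is illustrative only: the sample rows are invented, and the "always drop the last parsed row" rule (`df.iloc[:-1]`) is an assumption about the earlier reader behavior inferred from this comment, not a quote of the cosipy source.

```python
import io

import pandas as pd

# Hypothetical two-row file body, deliberately without a final newline.
BODY = (
    "Type OrientationsGalactic\n"
    "OG 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0\n"
    "OG 1.0 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5"
)
ENDINGS = {"EN": "\nEN\n", "blank line": "\n\n", "nothing": ""}


def parse(text):
    """Same read_csv options as the parsing call quoted in this thread."""
    return pd.read_csv(io.StringIO(text), sep=r"\s+", skiprows=1,
                       usecols=tuple(range(1, 10)), header=None, comment="#")


counts = {}
for label, ending in ENDINGS.items():
    df = parse(BODY + ending)
    kept = df.iloc[:-1]  # assumed rule: unconditionally discard the last row
    counts[label] = (len(df), len(kept))

for label, (parsed, kept) in counts.items():
    print(f"{label}: parsed {parsed} rows, kept {kept}")
```

Only the file ending in EN keeps both data rows under that rule; the other two lose a real data row, which matches the symptom reported here.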

So before trying to fix the code, can we clarify what we consider the correct .ori format? Can MEGAlib itself ever produce an .ori file without the EN marker? If not, I would argue that we should fix the two files we are using without such markers and then have cosipy throw an error if the EN is missing. We could even expend a bit of extra time to check for the EN explicitly, rather than relying on an indirect test with the CSV parser, since .ori is no longer our primary input format now that we have FITS.

In any case, I clearly need to regenerate the FITS file for _1sbins and _15sbins so they have the last line, and also to fix that one test case .ori file.

jdbuhler · 2026-03-15T01:24:26Z

Another comment on how we parse .ori files: the basic parsing call is

df = pd.read_csv(file, sep=r"\s+", skiprows=1, usecols=tuple(range(1,10)), header = None, comment = '#')

Do we actually want to support commenting out lines with '#'? Is this a feature of MEGAlib's ori files? If not, I would argue that it is more likely to cause unexpected behavior than to be useful.

israelmcmc · 2026-03-16T15:04:42Z

> So before trying to fix the code, can we clarify what we consider the correct .ori format? Can MEGAlib itself ever produce an .ori file without the EN marker? If not, I would argue that we should fix the two files we are using without such markers and then have cosipy throw an error if the EN is missing. We could even expend a bit of extra time to check for the EN explicitly, rather than relying on an indirect test with the CSV parser, since .ori is no longer our primary input format now that we have FITS.

I know I was the one who suggested the workaround that @GallegoSav implemented in this PR (thanks @GallegoSav, btw). I was trying to save some time, but @jdbuhler, if you are willing to fix the files (it seems you already did in #517, correct?) and add a check for EN, then I agree and we can go with your solution instead.

> Do we actually want to support commenting out lines with '#'? Is this a feature of MEGAlib's ori files?

I don't think MEGAlib comments out lines with # for .ori files specifically (it does for other types of files), but we sometimes manually add information to the files with # after the fact (e.g. caveats, provenance, etc.).
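
As a reference for this discussion, the sketch below shows what `comment='#'` does in `pd.read_csv`, using invented sample rows and a hypothetical provenance note. A line that starts with `#` is skipped entirely, while a `#` in the middle of a data line truncates that line, leaving the remaining columns as NaN.

```python
import io

import pandas as pd

# Hypothetical content: one full-line comment and one stray inline '#'.
TEXT = (
    "Type OrientationsGalactic\n"
    "# provenance: hand-edited note (hypothetical)\n"
    "OG 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0\n"
    "OG 1.0 1.5 2.5 3.5 # stray marker 4.5 5.5 6.5 7.5 8.5\n"
)

df = pd.read_csv(io.StringIO(TEXT), sep=r"\s+", skiprows=1,
                 usecols=tuple(range(1, 10)), header=None, comment="#")

n_rows = len(df)                          # the full-line comment is not a row
n_missing = int(df.iloc[1].isna().sum())  # columns lost after the inline '#'
n_after_dropna = len(df.dropna())         # the truncated row vanishes too
print(n_rows, n_missing, n_after_dropna)
```

So full-line annotations are harmless to the parse, but an accidental `#` inside a data line is not reported as an error; combined with a `dropna()` step, the affected row would be removed without warning.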

jdbuhler · 2026-03-16T16:33:22Z

Yes, I fixed the two offending tutorial .ori files (which are now in the develop tree on wasabi). Are we good with replacing the existing files in DC3 with these?

I can work on the explicit EN check outside the framework of CSV parsing -- I'll just seek to the end of the file and look there.
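
One way such a check could look is sketched below. This is not the code from any cosipy PR: the function name `ends_with_en`, the `tail_bytes` window, and the rule "the last non-blank line must be exactly EN" are all assumptions made for illustration.

```python
import os
import tempfile


def ends_with_en(path, tail_bytes=256):
    """Return True if the last non-blank line of the file is exactly 'EN'."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)        # jump to the end to get the size
        start = max(0, size - tail_bytes)
        f.seek(start)                        # read only the final bytes
        tail = f.read().decode("ascii", errors="replace")
    lines = tail.splitlines()
    if start > 0:
        lines = lines[1:]                    # first piece may be a partial line
    non_blank = [ln.strip() for ln in lines if ln.strip()]
    return bool(non_blank) and non_blank[-1] == "EN"


# Hypothetical demo files written to a temporary directory.
body = "Type OrientationsGalactic\nOG 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0\n"
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "with_en.ori")
    bad = os.path.join(tmp, "without_en.ori")
    with open(good, "w") as f:
        f.write(body + "EN\n")
    with open(bad, "w") as f:
        f.write(body)
    results = (ends_with_en(good), ends_with_en(bad))
print(results)
```

Reading only the tail keeps the check cheap for large files. A real implementation would still need to decide how to treat `#` comment lines placed after EN and files with more trailing whitespace than the window covers.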

israelmcmc · 2026-03-16T19:29:17Z

> Are we good with replacing the existing files in DC3 with these?

My preference is to leave those as is, and only fix the develop and DC4 folders. For DC3 I try to do hot fixes only when absolutely necessary. In this case the error from missing the last line is pretty small, and it doesn't prevent the code from running. Otherwise we would have to change the checksum of the release as well, not just develop. I'm also tagging @ckarwin to see what he thinks.

> I can work on the explicit EN check outside the framework of CSV parsing -- I'll just seek to the end of the file and look there.

Sounds good, thank you.

jdbuhler · 2026-03-21T01:07:57Z

@israelmcmc and @GallegoSav, please see PR #533.

Modify spacecraft_file.py to drop NaN values

81ebf12

Updated data processing to drop NaN values from the dataframe.

GallegoSav assigned jdbuhler and scipascal Mar 13, 2026

GallegoSav added 2 commits March 13, 2026 21:20

Transpose values after dropping NaN entries

683d1f2

Change dropna() to return values in transpose

61429a9

GallegoSav changed the title ~~Modify ori file reader to deal with absence of EN (or not) at the end orf the file~~ Modify ori file reader to deal with absence of EN (or not) at the end of the file Mar 13, 2026

israelmcmc mentioned this pull request Mar 16, 2026

Use fixed FITS reponse files to address issue #503 #517

Merged

jdbuhler mentioned this pull request Mar 21, 2026

Explicitly check for "EN" at end of .ori files #533

Open

GallegoSav wants to merge 3 commits into cositools:develop from GallegoSav:issue_503
